Regular Expression Search Operators
The following characters have special meaning in regular expression search terms. To search for one of these characters as a literal, precede it with the \ character.
- + * ? ( ) [ ] \ | ^ $ !
For example, to search for the \ character, you must double it (\\) for \ to be matched correctly when using Regular Expressions. See Regular Expression Literals for more information.
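The escaping rule above can be applied mechanically. The following Python sketch is only an illustration and is not part of Search and Replace; the helper name `escape_literal` is hypothetical, and the character set is copied from the list above.

```python
# Characters listed above as having special meaning in search terms.
SPECIAL_CHARS = set("-+*?()[]\\|^$!")


def escape_literal(text: str) -> str:
    """Prefix every special character in `text` with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


print(escape_literal("C:\\temp"))  # prints C:\\temp
print(escape_literal("What?!"))    # prints What\?\!
```

The result is a search term in which every listed character is treated as a literal.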
You may wish to increase the "Maximum Regular Expression Size" under Search Options if you have a regular expression search that involves Binary Characters and/or is very complex.
Please see Word Documents Notes for important information on searching and replacing in non-text files such as Word documents, WordPerfect documents, spreadsheets, etc.
Regular Expression Search - Match Operators | |||
* |
Zero or More Operator: Matches zero or more occurrences of an expression enclosed in () or []. * may be used by itself, although it is intended to be used around strings. If the * operator is entered alone, it will match all characters from the start of the line to the end of the line. You can match characters between two or more strings, up to the maximum regular expression size, by specifying a range after the * operator. Take care when entering several expressions in a row that contain *, because overlapping matches may produce unpredictable results. | ||
|
*(is) |
matches |
is, Mississippi |
|
Note: When * is combined with a numeric range and the %n>> or %n>starting value> replacement operators, a search expression such as Windows *[0-9] would be part of a Regular Expression Counter Operation. | ||
|
Note: By design, * alone does not match characters under 'space' - ASCII 32 or 20 hex. If you need to search for all possible characters using *, use the form *[] or *[\0x-00- ]. |
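As a rough analogy only, the difference between * alone and *[] can be sketched with Python's `re` module, whose syntax differs from Search and Replace. The translation below assumes that "characters under space" means code points 0 to 31, which includes the \r and \n line boundary characters, and `MAX_SPAN` is a hypothetical stand-in for the Maximum Regular Expression Size setting.

```python
import re

# Hypothetical cap standing in for "Maximum Regular Expression Size".
MAX_SPAN = 4096

# Start*End: '*' alone does not match characters below ASCII 32,
# so the match cannot cross a \r\n line boundary.
same_line = re.compile(r"Start[^\x00-\x1f]*End")

# Start*[]End: '*[]' accepts every character, up to the size cap.
any_chars = re.compile(r"Start[\s\S]{0,%d}End" % MAX_SPAN)

two_lines = "Start of line one\r\nline two End"
one_line = "Start to End"

print(same_line.search(one_line) is not None)   # True
print(same_line.search(two_lines) is not None)  # False
print(any_chars.search(two_lines) is not None)  # True
```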
+ |
One Or More Operator: Matches one or more occurrences of the expression. + is intended to be combined with (), and care should be taken when using + by itself. For example, | ||
|
+(is) |
matches |
is, Mississippi |
|
Note: By design, + alone does not match characters under 'space' - ASCII 32 or 20 hex. If you need to search for all possible characters using +, including low order characters, use the form +[] or +[\0x-00- ]. |
? |
One Occurrence Operator: Matches exactly one character, either before or after a string. ? also matches any single character between two strings. When combined with (), ? matches exactly one expression enclosed in (). Using the ? operator by itself will match every character in a file one at a time and therefore should probably be avoided. | ||
|
?(is) |
matches |
is |
|
Note: By design, ? alone does not match characters under 'space' - ASCII 32 or 20 hex. If you need to search for all possible single characters using ?, including low order characters, use the form ?[] or ?[\0x-00- ]. |
| |
Or Operator: Matches the simple expression either before or after the | (pipe) symbol. This should be used in conjunction with (). Or expressions should not contain other operators such as *+^$?. You may, however, make use of other operators outside the (). For example, | ||
|
(01/|02/)+[0-9](/95|/98) |
matches |
01/15/98 & 02/12/98 |
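For readers more familiar with conventional regex syntax, the example above can be approximated in Python's `re` module. This is a sketch under the assumption that +[0-9] means "one or more digits", written `[0-9]+` in Python, where the quantifier follows the expression instead of preceding it.

```python
import re

# Approximate Python equivalent of (01/|02/)+[0-9](/95|/98).
DATE = re.compile(r"(01/|02/)[0-9]+(/95|/98)")

for candidate in ["01/15/98", "02/12/98", "03/15/98", "01/15/97"]:
    print(candidate, bool(DATE.fullmatch(candidate)))
```

Only the first two candidates match: the third fails the month alternatives and the fourth fails the year alternatives.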
! |
Not Operator: A match is made when both a 'positive' hit component and a !() or ![] component of the expression are found. The complete expression requires both components. The first may be as simple as a single regular expression operator such as * or ?. You should provide a wildcard operator of some type prior to the ! component. The ! component should be enclosed in () or []. Be sure to nest ( ) when using |, e.g., ?!(a|b) won't work; use ?!((a|b)) instead. Multiple !() components can be used to create an 'or', e.g., ?!(a)!(b)!(c). You can also use other regular expressions inside ( ). Additional 'positive hit' strings and/or regular expressions to find may be specified after !() or ![]. Note, however, that regular expressions following the !() or ![] will not be available to the %n operators. See Not Operator Notes for more information. | ||
|
?at!((b|c)at) |
matches |
mat & sat but not 'b'at or 'c'at |
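The closest conventional-regex counterpart to this example is a negative lookahead. The Python sketch below is an analogy, not a description of how Search and Replace evaluates the ! operator. It assumes case-insensitive matching and translates ? as "one character at or above ASCII 32".

```python
import re

# Rough Python counterpart of ?at!((b|c)at): any single character followed
# by "at", unless the three characters are "bat" or "cat".
NOT_B_OR_C = re.compile(r"(?!(?:b|c)at)[^\x00-\x1f]at", re.IGNORECASE)

words = "mat sat bat cat"
print(NOT_B_OR_C.findall(words))  # ['mat', 'sat']
```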
^ |
Beginning Of Line Operator: Matches an expression at the beginning of a line. ^ should be the first character in your search term. ^ is best thought of as an 'anchor': it anchors the entire expression to the start of a line. ^ can be combined with other wildcard and subexpression operators, with the following qualifications:
- Only one ^ can be present in an expression. If you need to consider two 'beginning of line' terms, use line boundary characters (\r\n) as literals as in the example below.
- A search such as \r\nFind this is the same thing as ^Find this (if your files are PC format).
- ^ can be used in 'not' expressions but do not use ^ inside () expressions. Use *(\r)\n instead.
- Do not use ^ and $ in a single expression. If you need to anchor a search to the start and end of a line, use literal line boundary characters at the end of your term. For example, if your files are PC format, use something like ^find this as the only thing on a line\r\n. If you are making a replace, include \r\n in your replace string so you don't strip out the line boundary characters.
- The ^, $, ^^, and $$ operators are counted for the purposes of an %n operator in a replacement expression. For example, in the search expression ^+[ ][a-zA-Z], the corresponding %n terms are: %1 = ^, %2 = +[ ], %3 = [a-zA-Z].
- A Trick/Tip: During replacements, Search and Replace assumes ^ in the replacement term, so it is often not necessary to reference ^ specifically in your replacement string. For example, an operation to remove the first character from each line could use:
Some other examples of ^ are: | ||
|
^the |
matches |
the, The, THE, tHE at the beginning of a line |
|
^*( )BEnd\r\n*( )Exit Function |
matches |
<space(s)>BEnd <immediately followed on next line by> <space(s)>Exit Function |
|
^the*end.\r\n |
matches |
An entire line that begins with The and ends with end. |
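The last example, which anchors a term to both the start and the end of a line by ending it with literal \r\n, can be approximated in Python. This sketch assumes PC format (\r\n) line endings and case-insensitive matching, and it escapes the period because `.` is a wildcard in Python's `re` syntax.

```python
import re

# Rough Python counterpart of ^the*end.\r\n for PC-format (\r\n) text.
WHOLE_LINE = re.compile(
    r"^the[^\x00-\x1f]*end\.\r\n", re.IGNORECASE | re.MULTILINE
)

text = "The story comes to an end.\r\nthe end is not here\r\nThe end.\r\n"
lines = [m.group(0) for m in WHOLE_LINE.finditer(text)]
print(lines)  # ['The story comes to an end.\r\n', 'The end.\r\n']
```

Each match includes the trailing \r\n, which is why the text above recommends putting \r\n back in the replace string.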
$ |
End Of Line Operator: The $ operator is similar to the ^ operator but anchors your search to the end of a line. $ can be used with other wildcard and subexpression operators, with the following qualifications:
- Only one $ can be present in an expression. If you need to anchor a search to two line ends, use line boundary characters (\r\n) as literals.
- These two search terms are the same in PC format files: FindThis$ and FindThis\r\n.
- $ can be used in 'not' expressions but do not use $ inside () expressions. Use *(\r)\n instead.
- Do not use ^ and $ in a single expression. If you need to anchor a search to the start and end of a line, use literal line boundary characters at the end of your term. For example, if your files are PC format, use something like ^find this as the only thing on a line\r\n. If you are making a replace, include \r\n in your replace string so you don't strip out the line boundary characters.
- Note: The ^, $, ^^, and $$ operators are counted for the purposes of an %n operator in a replacement expression. For example, in the search expression l+[ls]$, the corresponding %n terms are: %1 = +[ls], %2 = $.
Some examples of $ are: | ||
|
end$ |
matches |
end only if it is at the end of a line |
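An end-of-line anchor of this kind can be sketched in Python with a lookahead for the line boundary. Python's own `$` matches only before \n, not before \r\n, so the boundary is spelled out here. Treating the end of the text as a line end is an assumption of this sketch, not something the text above states.

```python
import re

# "end" only when followed by a line boundary (or, by assumption, end of text).
END_OF_LINE = re.compile(r"end(?=\r\n|\n|\Z)", re.IGNORECASE)

text = "the end\r\nendless\r\nweekend"
starts = [m.start() for m in END_OF_LINE.finditer(text)]
print(starts)  # [4, 22]
```

The "end" inside "endless" is skipped because it is not followed by a line boundary.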
^^ |
Beginning Of File Operator: Matches an expression found at the beginning of a file. Usage is similar to ^. Do not use ^^ inside (). Note: The ^, $, ^^, and $$ operators are counted for the purposes of an %n operator in a replacement expression. For example, in the search expression ^^?omething, the corresponding %n terms are: %1 = ^^, %2= ?. | ||
|
^^First |
matches |
First in "First line of the file" if that string is on the first line |
$$ |
End Of File Operator: Matches an expression found at the end of the file. Usage is similar to $. Do not use $$ inside (). Note: The ^, $, ^^, and $$ operators are counted for the purposes of an %n operator in a replacement expression. For example, in the search expression *$$, the corresponding %n terms are: %1 = *, %2 = $$. | ||
|
*$$ |
matches |
The last line in the file |
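In Python's `re` syntax the nearest counterparts to ^^ and $$ are `\A` (start of text) and `\Z` (end of text). The sketch below is an analogy only. It also mirrors the %n numbering described above by giving each term its own capture group, assumes case-insensitive matching, and assumes the file does not end with a trailing line boundary.

```python
import re

text = "First line of the file\r\nmiddle line\r\nlast line"

# ^^First: "First" only at the very start of the text.
bof = re.search(r"\AFirst", text, re.IGNORECASE)

# *$$: group 1 plays the role of %1 (*), group 2 the role of %2 ($$).
eof = re.search(r"([^\x00-\x1f]*)(\Z)", text)

print(bof.start())   # 0
print(eof.group(1))  # last line
```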
Regular Expression Search - Sub-Expression Operators | |||
[ ] |
Range Operator: This may be a list of single characters such as [gdo], one or more ranges of characters such as [d-o0-2], or a more complex expression using other match or sub-expression operators such as do[g|uble]. Ranges use an "a-z" type of notation and are parsed in the order of the table of characters in the Binary Mode - Binary Codes list. If you need to include the - character as a specific character, make it a literal by specifying \-. Use the ?, *, or + operators to modify the range to be matched by the [] sub-expression operator. When nothing is specified inside the brackets, [] matches all characters and is equivalent to ?[]. Be careful if you specify [] as the only string to search for -- it will match all characters in the file, one at a time. The term *[] spans across one or more lines up to the number of characters specified by Options-Search: Maximum Regular Expression Size. *[] is very useful for 'finding anything' between two other components of your search term. If ?, *, or + are not specified, expressions that use a range in [ ] match single characters. This is the same as specifying ?. | ||
|
t[]e |
matches |
The, Toe |
|
Note: When [] is combined with a numeric range, the * operator, and the %n>> or %n>starting value> replacement operators, a search expression such as Windows *[0-9] would be part of a Regular Expression Counter Operation. |
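As an approximate illustration in Python's `re` syntax, which differs from Search and Replace, an empty [] corresponds to "any one character", written `[\s\S]` here, while a character list or range such as [d-o0-2] carries over unchanged. Case-insensitive matching for the first pattern is an assumption based on the example above.

```python
import re

# t[]e: "t", then any single character, then "e".
ANY_BETWEEN = re.compile(r"t[\s\S]e", re.IGNORECASE)
print(ANY_BETWEEN.findall("The Toe tune"))  # ['The', 'Toe']

# [d-o0-2]: one character in the range d-o or 0-2.
IN_RANGE = re.compile(r"[d-o0-2]")
print(IN_RANGE.findall("dog 123 zap"))  # ['d', 'o', 'g', '1', '2']
```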
( ) |
Subexpression Operator: Parentheses are used to denote one or more sub-expressions. This is usually combined with the OR operator - |. For example, | ||
|
Win( 95|dows 95) |
matches |
Windows 95, Win 95 |
+n |
Column Specifier: This is used to denote the number of columns to match either before or after an expression. The Column Specifier may be used with a simple search term, such as the expression +4The, or in combination with the [] or () sub-expression operators. A range of columns to match may also be specified, such as The+4-10. Note that you should combine the + operator with [] or () if you want to be clear about literal strings that serve as an anchor to the expression. Some examples: | ||
|
w+2[a-z] |
matches |
Wor in Hello World |
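In conventional regex syntax a column count is usually written as a counted quantifier. The Python sketch below approximates w+2[a-z] as `w[a-z]{2}`, assuming case-insensitive matching as the example above implies.

```python
import re

# w+2[a-z]: "w" followed by exactly two columns of letters.
TWO_COLUMNS = re.compile(r"w[a-z]{2}", re.IGNORECASE)

match = TWO_COLUMNS.search("Hello World")
print(match.group(0))  # Wor
```

By the same analogy, a column range such as +4-10 would correspond to a `{4,10}` quantifier, though how Search and Replace chooses among the possible lengths is not stated here.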
Regular Expression Search - Special Literal Characters | |
- + * ? ( ) [ ] \ | $ ^ ! |
If you wish to search for any of these characters, they must be preceded by the \ character to be interpreted as a literal in a search. |